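Deep ensembles have been shown to extend the positive effects seen in typical ensemble learning to neural networks and to reinforcement learning (RL). However, much remains to be done to improve the efficiency of such ensemble models. In this work, we present Diverse Ensembles for Fast Transfer in RL (DEFT), a new ensemble-based method for reinforcement learning in highly multimodal environments and for improved transfer to unseen environments. The algorithm has two main phases: training the ensemble members, and synthesizing (or fine-tuning) the ensemble members into a policy that works in a new environment. The first phase trains regular policy gradient or actor-critic agents in parallel, but adds a term to the loss that encourages these policies to differ from one another. This causes the individual unimodal agents to explore the space of optimal policies and to capture more of the multimodality of the environment than a single actor could. The second phase of DEFT synthesizes the component policies, in one of two ways, into a new policy that works well in a modified environment. To evaluate the performance of DEFT, we start with a base version of the Proximal Policy Optimization (PPO) algorithm and extend it with the DEFT modifications. Our results show that the pretraining phase is effective at producing diverse policies in multimodal environments. DEFT often converges to a high reward significantly faster than alternatives such as random initialization without DEFT and fine-tuning of ensemble members. While more work is certainly needed to analyze DEFT theoretically and to make it more robust, we believe it provides a strong framework for capturing multimodality in environments while still using RL methods with simple policy representations.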
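The first training phase described above can be sketched as a per-member RL loss plus a diversity bonus. The abstract does not specify the diversity term, so the mean pairwise KL divergence, the function names, and the `diversity_weight` value below are assumptions made for illustration only, not the authors' formulation.

```python
import math

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete action distributions given as lists."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q))

def mean_pairwise_kl(policies):
    """Average KL over all ordered pairs of distinct ensemble members.

    policies[m][s] is member m's action distribution in state s.
    """
    members = len(policies)
    total, pairs = 0.0, 0
    for i in range(members):
        for j in range(members):
            if i == j:
                continue
            per_state = [kl_divergence(p, q) for p, q in zip(policies[i], policies[j])]
            total += sum(per_state) / len(per_state)
            pairs += 1
    return total / pairs

def diversity_regularized_loss(policy_losses, policies, diversity_weight=0.1):
    """Sum of per-member RL losses minus a weighted diversity bonus (assumed form)."""
    return sum(policy_losses) - diversity_weight * mean_pairwise_kl(policies)

# Toy data: two ensemble members, two states, two actions (hypothetical values).
identical = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
diverse = [[[0.9, 0.1], [0.5, 0.5]], [[0.1, 0.9], [0.5, 0.5]]]

loss_identical = diversity_regularized_loss([1.0, 1.0], identical)
loss_diverse = diversity_regularized_loss([1.0, 1.0], diverse)
```

With equal per-member losses, the ensemble whose members disagree receives the lower total loss, which is the pressure toward differing policies that the abstract describes.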
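Polyculture farming has environmental advantages but requires substantially more pruning than monoculture farming. We present novel hardware and algorithms for automated pruning. Using an overhead camera to collect data from a physical scale garden testbed, the autonomous system uses a learned Plant Phenotyping convolutional neural network and a Bounding Disk Tracking algorithm to evaluate the distribution of individual plants and to estimate the state of the garden each day. From this garden state, AlphaGardenSim selects plants to prune autonomously. A trained neural network detects and targets specific prune points on the plant. We experimentally evaluate two custom-designed pruning tools, compatible with an agricultural robot gantry system, which execute autonomous cuts through controlled algorithms. We present results for four 60-day garden cycles. Results suggest that the system can autonomously achieve 0.94 normalized plant diversity with pruning shears while maintaining an average canopy coverage of 0.84 by the end of the cycles. For code, videos, and datasets, see https://sites.google.com/berkeley.edu/pruningpolyculture.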
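Sim-to-real transfer has become a popular and highly successful method for training robotic control policies for a wide variety of tasks. However, it is often challenging to determine when a policy trained in simulation is ready to be transferred to the physical world. Deploying policies trained with very little simulation data can result in unreliable and dangerous behavior on physical hardware. On the other hand, excessive training in simulation can cause policies to overfit to the visual appearance and dynamics of the simulator. In this work, we study strategies for automatically determining when policies trained in simulation can be reliably transferred to a physical robot. We study these ideas specifically in the context of robotic fabric manipulation, where successful sim2real transfer is especially challenging because of the difficulty of accurately modeling the dynamics and visual appearance of fabric. Results on a fabric smoothing task suggest that our switching criteria correlate well with performance in the real world. In particular, our confidence-based switching criteria achieve an average final fabric coverage of 87.2-93.7% within 55-60% of the total training budget. For code and supplementary material, see https://tinyurl.com/lsc-case.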
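Commercial and industrial deployments of robot fleets often fall back on remote human teleoperators during execution when robots are at risk or unable to make task progress. With continual learning, interventions from the remote humans can also be used to improve the robot fleet control policy over time. A central question is how to allocate human attention effectively to individual robots. Prior work addresses this in the single-robot, single-human setting. We formalize the Interactive Fleet Learning (IFL) setting, in which multiple robots interactively query and learn from multiple human supervisors. We present a fully implemented open-source IFL benchmark suite of GPU-accelerated Isaac Gym environments for evaluating IFL algorithms. We propose Fleet-DAgger, a family of IFL algorithms, and compare a novel Fleet-DAgger algorithm to 4 baselines in simulation. We also perform 1000 trials of a physical block-pushing experiment with 4 ABB YuMi robot arms. Experiments suggest that the allocation of humans to robots significantly affects robot fleet performance, and that our algorithm achieves 8.8 times the return on human effort of the baselines. For code, videos, and supplementary material, see https://tinyurl.com/fleet-dagger.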
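This paper presents AlphaGarden: an autonomous polyculture garden that prunes and irrigates living plants in a 1.5 m × 3.0 m physical testbed. AlphaGarden uses an overhead camera and sensors to track plant distribution and soil moisture. We model individual plant growth and inter-plant dynamics to train a policy that chooses actions to maximize leaf coverage and diversity. For autonomous pruning, AlphaGarden uses two custom-designed pruning tools and a trained neural network to detect prune points. We present results for four 60-day garden cycles. Results suggest that AlphaGarden can autonomously achieve 0.96 normalized diversity while maintaining an average canopy coverage of 0.86 during the peak of the cycle. Code, datasets, and supplementary material can be found at https://github.com/berkeleyautomation/alpharden.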
Embedding words in vector space is a fundamental first step in state-of-the-art natural language processing (NLP). Typical NLP solutions employ pre-defined vector representations to improve generalization by co-locating similar words in vector space. For instance, Word2Vec is a self-supervised predictive model that captures the context of words using a neural network. Similarly, GloVe is a popular unsupervised model incorporating corpus-wide word co-occurrence statistics. Such word embeddings have significantly boosted important NLP tasks, including sentiment analysis, document classification, and machine translation. However, the embeddings are dense floating-point vectors, making them expensive to compute and difficult to interpret. In this paper, we instead propose to represent the semantics of words with a few defining words that are related using propositional logic. To produce such logical embeddings, we introduce a Tsetlin Machine-based autoencoder that learns logical clauses in a self-supervised manner. The clauses consist of contextual words like "black," "cup," and "hot" to define other words like "coffee," and are therefore human-understandable. We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GloVe on six classification tasks. Furthermore, we investigate the interpretability of our embedding using the logical representations acquired during training. We also visualize word clusters in vector space, demonstrating how our logical embedding co-locates similar words.
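To make the idea of defining a word through logical clauses over context words concrete, here is a minimal sketch. The clauses for "coffee" are hand-written and hypothetical; the abstract does not give the learned clauses, and `clause_matches` and `score_word` are illustrative names, not part of any Tsetlin Machine library.

```python
def clause_matches(clause, context):
    """A conjunctive clause fires when every required word is present
    and no excluded word is present in the context."""
    return clause["include"] <= context and not (clause["exclude"] & context)

def score_word(clauses, context):
    """Count how many of a target word's clauses fire on the given context."""
    return sum(clause_matches(c, context) for c in clauses)

# Hypothetical, hand-written clauses that "define" the word coffee.
coffee_clauses = [
    {"include": {"black", "cup"}, "exclude": set()},   # black AND cup
    {"include": {"hot", "cup"}, "exclude": {"tea"}},   # hot AND cup AND NOT tea
]

coffee_context = {"a", "hot", "black", "cup"}
tea_context = {"a", "hot", "cup", "of", "tea"}

coffee_score = score_word(coffee_clauses, coffee_context)
tea_score = score_word(coffee_clauses, tea_context)
```

Each clause can be read directly as a propositional formula, which is the source of the interpretability the abstract claims for logical embeddings.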
Large training data and expensive model tweaking are standard features of deep learning for images. As a result, data owners often use cloud resources to develop large-scale complex models, which raises privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of data and models. In this paper, we study and compare novel image disguising mechanisms, DisguisedNets and InstaHide, aiming to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data. DisguisedNets are novel combinations of image blocktization, block-level random permutation, and two block-level secure transformations: random multidimensional projection (RMT) and AES pixel-level encryption (AES). InstaHide is an image mixup and random pixel flipping technique \cite{huang20}. We have analyzed and evaluated them under a multi-level threat model. RMT provides a better security guarantee than InstaHide under Level-1 adversarial knowledge, with well-preserved model quality. In contrast, AES provides a security guarantee under Level-2 adversarial knowledge, but it may affect model quality more. The unique features of image disguising also help us to protect models from model-targeted attacks. We have conducted an extensive experimental evaluation to understand how these methods work in different settings and on different datasets.
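The block-based disguising pipeline named above (blocktization, block-level permutation, random projection) can be sketched roughly as follows. This is not the DisguisedNets implementation: the single shared Gaussian matrix, the use of a seed as the secret key, and the function names are all assumptions for illustration, and the sketch makes no security claim.

```python
import random

def split_blocks(image, b):
    """Split an H x W image (list of rows) into flattened b x b blocks, row-major."""
    h, w = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(b) for dc in range(b)]
        for r in range(0, h, b)
        for c in range(0, w, b)
    ]

def disguise(image, b, seed):
    """Permute the blocks, then project each block with a secret random matrix."""
    rng = random.Random(seed)      # assumption: the seed stands in for the secret key
    blocks = split_blocks(image, b)
    order = list(range(len(blocks)))
    rng.shuffle(order)             # block-level random permutation
    d = b * b
    # Assumption: one Gaussian projection matrix shared by all blocks.
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    projected = []
    for idx in order:
        x = blocks[idx]
        projected.append([sum(m * v for m, v in zip(row, x)) for row in matrix])
    return projected

# Toy 4 x 4 "image" with pixel values 0..15, split into 2 x 2 blocks.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
disguised = disguise(image, 2, seed=7)
```

The data owner keeps the seed, so the same transformation can be applied consistently to training and test images before they are sent to the cloud.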
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches has made it possible to solve high-dimensional PDEs, opening the door to a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. To tackle these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance than a DNN with an equivalent parameter count. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
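As a rough illustration of where such parameter savings can come from, the arithmetic below compares a dense weight matrix with a two-core tensor factorization of the same shape. The abstract does not state the factorization, layer sizes, or rank; the two-core layout, the helper names, and the numbers are assumptions chosen only for illustration.

```python
def dense_params(n_in, n_out):
    """Number of weights in a dense layer (bias ignored)."""
    return n_in * n_out

def two_core_params(m1, m2, n1, n2, rank):
    """Weights in a two-core factorization of an (m1*m2) x (n1*n2) matrix.

    Core A has shape (m1, n1, rank) and core B has shape (rank, m2, n2);
    contracting them over the shared rank index yields the full matrix.
    """
    return m1 * n1 * rank + rank * m2 * n2

# Hypothetical layer: 1024 inputs and 1024 outputs, each factored as 32 * 32.
dense = dense_params(1024, 1024)
factored = two_core_params(32, 32, 32, 32, rank=4)
compression = dense / factored
```

In this toy setting the factorized layer stores 8,192 weights instead of 1,048,576. Whether accuracy is preserved at a given rank is an empirical question that this arithmetic does not address.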
When testing conditions differ from those represented in training data, so-called out-of-distribution (OOD) inputs can mar the reliability of black-box learned components in the modern robot autonomy stack. Therefore, coping with OOD data is an important challenge on the path towards trustworthy learning-enabled open-world autonomy. In this paper, we aim to demystify the topic of OOD data and its associated challenges in the context of data-driven robotic systems, drawing connections to emerging paradigms in the ML community that study the effect of OOD data on learned models in isolation. We argue that as roboticists, we should reason about the overall system-level competence of a robot as it performs tasks in OOD conditions. We highlight key research questions around this system-level view of OOD problems to guide future research toward safe and reliable learning-enabled autonomy.
The Tsetlin Machine (TM) has been gaining popularity as an inherently interpretable machine learning method that achieves promising performance with low computational complexity on a variety of applications. The interpretability and low computational complexity of the TM are inherited from the Boolean expressions it uses to represent various sub-patterns. Despite these favorable properties, the TM has not been the go-to method for AI applications, mainly because of its conceptual and theoretical differences from perceptrons and neural networks, which are more widely known and better understood. In this paper, we provide detailed insights into the operational concept of the TM and try to bridge the gap in theoretical understanding between the perceptron and the TM. More specifically, we study the operational concept of the TM following the analytical structure of perceptrons, showing the resemblance between perceptrons and the TM. Through the analysis, we show that the TM's weight update can be considered a special case of the gradient weight update. We also perform an empirical analysis of the TM by showing the flexibility in determining the clause length, visualizing decision boundaries, and obtaining interpretable Boolean expressions from the TM. In addition, we discuss the advantages of the TM in terms of its structure and its ability to solve more complex problems.
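The resemblance to a perceptron can be seen in how a TM turns Boolean clauses into a decision: each clause is an AND over selected literals, and clause outputs are summed with signs and thresholded. The sketch below uses hand-written clauses for XOR rather than learned ones, and the function names are illustrative assumptions, not an existing library API.

```python
def literals(x):
    """Input features followed by their negations: [x1..xn, NOT x1..NOT xn]."""
    return list(x) + [1 - v for v in x]

def clause_output(included, x):
    """Conjunction (AND) over the literals whose indices are in `included`."""
    lits = literals(x)
    return int(all(lits[k] for k in included))

def class_sum(positive, negative, x):
    """Positive-polarity clauses vote +1, negative-polarity clauses vote -1."""
    votes_for = sum(clause_output(c, x) for c in positive)
    votes_against = sum(clause_output(c, x) for c in negative)
    return votes_for - votes_against

def predict(positive, negative, x):
    """Threshold the signed vote sum at zero."""
    return int(class_sum(positive, negative, x) >= 0)

# Literal indices for two features: 0 = x1, 1 = x2, 2 = NOT x1, 3 = NOT x2.
xor_positive = [[0, 3], [1, 2]]   # (x1 AND NOT x2), (x2 AND NOT x1)
xor_negative = [[0, 1], [2, 3]]   # (x1 AND x2), (NOT x1 AND NOT x2)

xor_table = {x: predict(xor_positive, xor_negative, x)
             for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

The signed sum plays the role of a perceptron's weighted sum with weights fixed to +1 or -1, while each clause remains a readable Boolean expression.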